ORIGINAL ARTICLE

DNA evidence for strong genetic stability and increasing 
heritability of intelligence from age 7 to 12 

M Trzaskowski, J Yang, PM Visscher and R Plomin

Two genetic findings from twin research have far-reaching implications for understanding individual differences in the 
development of brain function as indexed by general cognitive ability (g, aka intelligence): (1) The same genes affect g throughout 
development, even though (2) heritability increases. It is now possible to test these hypotheses using DNA alone. From 1.7 million 
DNA markers and g scores at ages 7 and 12 on 2875 children, the DNA genetic correlation from age 7 to 12 was 0.73, highly similar 
to the genetic correlation of 0.75 estimated from 6702 pairs of twins from the same sample. DNA-estimated heritabilities increased 
from 0.26 at age 7 to 0.45 at age 12; twin-estimated heritabilities also increased from 0.35 to 0.48. These DNA results confirm the 
results of twin studies indicating strong genetic stability but increasing heritability for g, despite mean changes in brain structure 
and function from childhood to adolescence. 
INTRODUCTION

Although developmental research from childhood to adolescence
reveals species-general changes in brain structure and function,
much less is known about the development of individual
differences within our species, which has been called 'one of
the preeminent challenges of neuroimaging'. It is important to
understand the developmental etiology of individual differences,
because societal problems often involve individual differences:
for example, why some children are slow to speak, to learn or to
read. The description and causes of species' means are not
necessarily related to the description and causes of variances
within a species. Two well-replicated genetic findings from twin
studies comparing monozygotic and dizygotic (DZ) twins suggest
hypotheses at the level of individual differences in cognitive
ability that may be relevant to neuroscience, to the extent that
brain structure and function underlie cognitive outcomes. These
twin-study findings involve general cognitive ability, which was
labeled g by Spearman more than a century ago, but is
commonly known as intelligence. g is the most researched
cognitive trait in genetics and has important links with
neuroscience.

First, the heritability of g increases during development, even
from childhood to adolescence. This finding is counterintuitive
to the extent that genetic effects are thought to be static, and
environmental effects are expected to accumulate during
development. The increasing heritability of g also seems at odds
with the second genetic finding: the same genes largely affect g
throughout development. For example, in a longitudinal twin
analysis from childhood to adolescence, the genetic correlation
was estimated as 0.96, although the 95% confidence interval for
this estimate was 0.74-1.0. The genetic correlation is literally the
correlation between genetic effects on g at the two ages,
independent of heritability. The high genetic correlation
implies that if a gene is found to be associated with g in
childhood, the gene is also highly likely to be associated with g in
adolescence. Later, we offer a hypothesis as to how heritability can
increase when genetic effects are stable from age to age.

These two genetic findings have not found much traction in the
neurodevelopmental literature. This neglect might be due in part
to a lack of attention to individual differences, but it might also be
due to skepticism about the twin method, which relies on some
major assumptions, most notably, equal environmental treatment
of monozygotic and DZ twins. Quantitative genetic designs such
as the twin method would no longer be needed if it were possible
to identify all of the genes responsible for heritability. However,
it has proven more difficult than expected to identify genes for
complex traits, including g, which has led to the refrain of
'missing heritability'. Nonetheless, it is now possible to use
DNA itself to estimate genetic variance and covariance in any
sample of unrelated individuals, not just samples consisting of
special family members such as twins or adoptees. The method,
called genome-wide complex trait analysis (GCTA), correlates
genomic similarity across hundreds of thousands of single
nucleotide polymorphisms (SNPs) with phenotypic similarity, pair
by pair, in a large sample of unrelated individuals. This
population-based DNA approach does not rely on the strong
assumptions made in classical twin studies. Although
conventionally unrelated individuals vary only slightly in their
genetic similarity, GCTA accumulates all the genotype-phenotype
association signals using the massive information available in a
matrix of thousands of individuals, each compared pair by pair
with every other individual in the sample. GCTA has been used to
estimate genetic influence for height, weight, psychiatric and
medical disorders, personality and even economic and
political preferences. GCTA has also been applied to g in adults
and children. These GCTA estimates of genetic influence,
although substantial, have been lower than heritability typically
found in twin studies of these traits. Using the 12-year data from
the sample in the present report, GCTA and twin estimates of
heritability were compared explicitly for several cognitive
measures; the GCTA estimate for g was 35% and the twin
estimate was 46%. Precision in comparing GCTA and twin
estimates is important because, as explained later, this comparison
reveals important information about a trait's genetic architecture.

This previous GCTA research involves univariate analysis in that 
it decomposes the phenotypic variance of a single trait into 
genetic and non-genetic components of variance. Recently, GCTA 
has been extended to bivariate analysis, which decomposes the 
phenotypic covariance between traits into components of 
covariance. The first preliminary attempt to extend GCTA to
bivariate analysis reported a genetic correlation of 0.62 for g in
childhood (age 11) and old age. Here, we use a new bivariate
GCTA method to test the hypotheses of strong stability and
increasing heritability for g from age 7 to 12. We also compare
GCTA estimates with those from a twin analysis based on the 
same sample at the same ages using the same measures. 



MATERIALS AND METHODS 

Sample 

The sample was drawn from the Twins Early Development Study (TEDS),
which is a multivariate longitudinal twin study that recruited over 11 000
twin pairs born in England and Wales in 1994, 1995 and 1996. TEDS is
representative of the UK population. The project received approval from
the Institute of Psychiatry ethics committee (05/Q0706/228), and parental 
consent was obtained before data collection. Individuals were included if 
their first language was English and they had no major medical or 
psychiatric problems. GCTA was conducted on g at ages 7 and 12 for 2875 
unrelated individuals in TEDS (only one member of a twin pair), of which 
1344 had g data at both ages. Twin model-fitting analyses of g at ages 7
and 12 were conducted for 6702 TEDS twin pairs, of which 2269 pairs had 
g data at both ages. As expected for representative twin studies, the twins 
included similar numbers of monozygotic twins, same-sex DZ twins and 
opposite-sex DZ twins. 

Genotyping 

Although DNA is available for more than 12 000 TEDS participants, funds 
were available to genotype 3665 individuals (one member only per twin 
pair) on Affymetrix GeneChip 6.0 (Affymetrix Inc., Santa Clara, CA, USA) SNP 
genotyping arrays using standard experimental protocols as part of the 
WTCCC2 project. In addition to nearly 700 000 genotyped SNPs, more than
one million other SNPs were imputed using IMPUTE v.2 software
(https://mathgen.stats.ox.ac.uk/impute/impute.html). DNA for 3152 individuals
(1446 males and 1706 females) survived quality control criteria. Of these
3152 individuals, 2875 had g scores at least at one age and 1344 had g
scores at both ages. To control for ancestral stratification, we performed
principal component analyses on a subset of 100 000 quality-controlled
SNPs after removing SNPs in linkage disequilibrium (r² > 0.2). Using the
Tracy-Widom test, we identified 8 axes with P<0.05, which were used
as covariates in GCTA analyses.

Measures 

The measures and testing procedures have been described in detail for
age 7 and 12. At each age, a composite index of g was derived from
two verbal tests and two non-verbal tests. At age 7, the two verbal tests
consisted of the Similarities subtest and the Vocabulary subtest from the
WISC-III-UK, and the two non-verbal tests were the Picture Completion
subtest from the WISC-III-UK and the Conceptual Grouping subtest from
the McCarthy Scales of Children's Abilities. At age 12, the verbal tests
included the Information and Vocabulary subtests from the WISC-III-PI
Multiple Choice test, and the two non-verbal reasoning tests were WISC-III-UK
Picture Completion and Raven's Standard and Advanced Progressive
Matrices. At age 7, testing was conducted by telephone as described
elsewhere; at age 12, testing was conducted online. For each cognitive
measure at each age, scores were regressed on sex and age and
standardized residuals were derived, ranked with random values given to
tied data, and quantile normalized. Finally, total composites for g were
created as unit-weighted means requiring complete data for at least three
of the four tests. All the procedures were executed using R
(www.r-project.org).
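
The last two processing steps described above can be sketched as follows. This is a minimal illustration in Python rather than the R code actually used; the function names, the (rank - 0.5)/n offset and the use of a seeded random generator for tie-breaking are assumptions, and the preceding regression on sex and age is omitted.

```python
import random
from statistics import NormalDist


def quantile_normalize(values, rng):
    """Rank-based inverse-normal transform with random tie-breaking.

    The (rank - 0.5) / n offset is an assumed convention; the text does
    not state which offset was used.
    """
    n = len(values)
    # Sort indices by value; the random second key orders tied values at random.
    order = sorted(range(n), key=lambda i: (values[i], rng.random()))
    normal = NormalDist()
    out = [0.0] * n
    for rank, i in enumerate(order, start=1):
        out[i] = normal.inv_cdf((rank - 0.5) / n)
    return out


def composite_g(test_scores, min_tests=3):
    """Unit-weighted mean of the available tests (None marks a missing test).

    Returns None when fewer than min_tests scores are present.
    """
    present = [s for s in test_scores if s is not None]
    if len(present) < min_tests:
        return None
    return sum(present) / len(present)


rng = random.Random(0)  # hypothetical seed, for reproducibility only
z = quantile_normalize([3.0, 1.0, 2.0, 2.0], rng)
g = composite_g([1.0, None, 2.0, 3.0])
```

Breaking ties at random, rather than assigning average ranks, guarantees that every child receives a distinct normal score, which matches the description of "random values given to tied data".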

Statistical analyses 

Genome-wide complex trait analysis. The first step in GCTA is to calculate
pairwise genomic similarity between all pairs of individuals in the sample
using all genetic markers genotyped on the SNP array. Because GCTA is
designed to estimate genetic variance due to linkage disequilibrium
between unknown causal variants and genotyped SNPs from a sample of
unrelated individuals in the population, any close genetic relatedness is
eliminated; for this reason any individual whose genetic similarity to
another is equal to or greater than that of fourth cousins is removed
(estimated pairwise relatedness > 0.025). The essence of GCTA is to compare
a matrix of pairwise genomic similarity to a matrix of pairwise phenotypic
similarity using a random-effects mixed linear model. In univariate analysis,
the variance of a trait can be partitioned using residual maximum likelihood
into genetic and residual components. Detailed description of this method
can be found in Yang, Lee et al. and Yang, Benyamin et al. The bivariate
method extends the univariate model by relating the pairwise genetic
similarity matrix to a phenotypic covariance matrix between traits 1 and 2,
allowing for correlated residuals. The eight principal components
described earlier were used as covariates in our GCTA analyses; as
mentioned, all phenotypes were age- and sex-regressed before analysis.
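
The first step, building the pairwise genomic similarity matrix and excluding close relatives, can be illustrated with a small sketch. It is not the GCTA implementation: the function names are hypothetical, the similarity is the average product of standardized allele counts (the diagonal is computed with the same formula as a simplification), and the greedy filter below does not attempt to retain the largest possible unrelated subset.

```python
def genomic_relationship_matrix(genotypes):
    """genotypes[i][j] is the allele count (0, 1 or 2) of person i at SNP j.

    Returns an n x n matrix of average products of standardized genotypes.
    """
    n, m = len(genotypes), len(genotypes[0])
    # Allele frequency of each SNP in the sample.
    freq = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    # Monomorphic SNPs carry no information and would divide by zero.
    keep = [j for j in range(m) if 0.0 < freq[j] < 1.0]
    std = [
        [(row[j] - 2 * freq[j]) / (2 * freq[j] * (1 - freq[j])) ** 0.5 for j in keep]
        for row in genotypes
    ]
    return [
        [sum(a * b for a, b in zip(std[i], std[k])) / len(keep) for k in range(n)]
        for i in range(n)
    ]


def unrelated_subset(grm, cutoff=0.025):
    """Greedily keep individuals whose relatedness to all kept ones is <= cutoff."""
    kept = []
    for i in range(len(grm)):
        if all(grm[i][k] <= cutoff for k in kept):
            kept.append(i)
    return kept


# Toy data: persons 0 and 1 have identical genotypes, so one of them is dropped.
toy = [[0, 1, 2, 0], [0, 1, 2, 0], [2, 1, 0, 2], [1, 2, 0, 1]]
grm = genomic_relationship_matrix(toy)
kept = unrelated_subset(grm)
```

With only four SNPs the estimates are far too noisy to be meaningful; the point is only to show where the 0.025 threshold enters.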

Twin modeling. The classical twin design and model-fitting is discussed
elsewhere. We fit a bivariate twin model using OpenMx, which
provided a direct comparison with the bivariate GCTA. The correlated
factor solution is the least restricted model allowing variables to correlate
with one another via genetic, shared environment and non-shared
environment. Because previous analyses of these data indicated
nonsignificant differences in model-fitting results between males and
females, we combined same-sex and opposite-sex DZ twin pairs in order to
increase the power of the analyses. Twin analyses limited to same-sex
twins yielded highly similar results (available from the first author).

RESULTS AND DISCUSSION 

Genetic stability 

As shown in Table 1, the GCTA genetic correlation between g at 
ages 7 and 12 was 0.73 (0.29 standard error, s.e.). Table 2 shows 
that the twin study yielded a highly similar genetic correlation of
0.75 (0.08 s.e.). The genetic correlation indexes the correlation 
between genetic effects on g at the two ages independent of 
heritability. That is, the genetic correlation can be high even if 
heritability is low. It is also possible to weight the genetic 
correlation by heritability in order to estimate the genetic 
contribution to the phenotypic correlation. The phenotypic 
correlation for g between ages 7 and 12 was 0.46 (0.02) for 
2408 children (one member randomly chosen from each twin pair) 
with g data at both ages. For GCTA, the genetic contribution to 
the phenotypic correlation was 0.25 (0.11), which is the GCTA 
genetic correlation weighted by heritability (that is, the product of 
the square roots of the GCTA heritabilities of g at the two ages). 
Another way of expressing this is as bivariate heritability, which is 
the proportion of the phenotypic correlation that can be 
attributed to genetic covariance. GCTA bivariate heritability was 
0.60 (that is, 0.25/0.42), indicating that 60% of the phenotypic
correlation could be accounted for by genetic factors. The 
comparable twin-study estimate of the genetic contribution to 
the phenotypic correlation was 0.31 (0.03), yielding a bivariate 
heritability of 0.68. 
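
The arithmetic behind the GCTA figures in this paragraph can be checked with a few lines of Python. The function names are illustrative, and the inputs are the rounded estimates reported in the text, so the results agree with the published values only to two decimal places.

```python
from math import sqrt


def genetic_contribution(r_g, h2_first, h2_second):
    """Genetic correlation weighted by the square roots of the two heritabilities."""
    return r_g * sqrt(h2_first * h2_second)


def bivariate_heritability(genetic_part, phenotypic_total):
    """Proportion of the phenotypic association attributable to genetic covariance."""
    return genetic_part / phenotypic_total


gcta_contribution = genetic_contribution(0.73, 0.26, 0.45)
print(round(gcta_contribution, 2))                    # 0.25
print(round(bivariate_heritability(0.25, 0.42), 2))   # 0.6
```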

Increasing heritability 

Despite the substantial genetic correlation of 0.73 from age 7 to 12,
GCTA estimates of genetic influence on g increased from 0.26 
Table 1. Bivariate GCTA results (with standard errors) for general cognitive ability (g) from age 7 to 12^a

(a) Genetic

Genetic variance at age 7                           0.26 (0.17)
Genetic variance at age 12                          0.45 (0.14)
Genetic covariance between age 7 and 12             0.25 (0.11)
Genetic correlation between age 7 and 12            0.73 (0.29)

(b) Environmental

Environmental variance at age 7                     0.74 (0.17)
Environmental variance at age 12                    0.55 (0.14)
Environmental covariance between age 7 and 12       0.18 (0.11)
Environmental correlation between age 7 and 12^b    0.28 (0.15)

Abbreviation: GCTA, genome-wide complex trait analysis.

^a GCTA incorporates full-information maximum likelihood that uses the full sample of 2875 individuals with data at either 7 or 12. However, the variance
estimates at each age are based on individuals with data at that age (1908 at 7, 2329 at 12) and the covariance estimates are based on individuals with data at
both ages (1344).

^b The current version of GCTA does not report the environmental correlation or its standard error. The environmental correlation was derived here from the
GCTA estimates as re = C(e)_tr12 / (sqrt(V(e)_tr1) * sqrt(V(e)_tr2)), and its standard error was calculated using:
Var(re) = re * re * (VarVe1/(4*Ve1*Ve1) + VarVe2/(4*Ve2*Ve2) + VarCe/(Ce*Ce) + CovVe1Ve2/(2*Ve1*Ve2) - CovVe1Ce/(Ve1*Ce) - CovVe2Ce/(Ve2*Ce));
SE(re) = sqrt[Var(re)],
where re is the environmental correlation, Ve1 is the residual variance for trait 1, Ce is the residual covariance between the two traits, VarVe1 is the
sampling variance for Ve1, VarCe is the sampling variance for Ce, CovVe1Ve2 is the sampling covariance between Ve1 and Ve2, and CovVe1Ce is the
sampling covariance between Ve1 and Ce.
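
The footnote formulas for the environmental correlation and its standard error translate directly into code. The sketch below uses hypothetical function and argument names. Because the sampling covariances among the GCTA estimates are not reported in the table, only the point estimate can be reproduced from the published values.

```python
from math import sqrt


def environmental_correlation(ve1, ve2, ce):
    """re = Ce / sqrt(Ve1 * Ve2)."""
    return ce / sqrt(ve1 * ve2)


def environmental_correlation_se(ve1, ve2, ce, var_ve1, var_ve2, var_ce,
                                 cov_ve1_ve2, cov_ve1_ce, cov_ve2_ce):
    """Standard error of re, following the Var(re) expression in the footnote."""
    re = environmental_correlation(ve1, ve2, ce)
    var_re = re * re * (
        var_ve1 / (4 * ve1 * ve1)
        + var_ve2 / (4 * ve2 * ve2)
        + var_ce / (ce * ce)
        + cov_ve1_ve2 / (2 * ve1 * ve2)
        - cov_ve1_ce / (ve1 * ce)
        - cov_ve2_ce / (ve2 * ce)
    )
    return sqrt(var_re)


# Point estimate from the Table 1 values (residual variances 0.74 and 0.55,
# residual covariance 0.18).
print(round(environmental_correlation(0.74, 0.55, 0.18), 2))  # 0.28
```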



Table 2. Bivariate twin model-fitting results (with standard errors) for general cognitive ability from age 7 to 12^a

Genetic

Genetic variance at age 7                     0.36 (0.03)
Genetic variance at age 12                    0.49 (0.04)
Genetic covariance between age 7 and 12       0.31 (0.03)
Genetic correlation between age 7 and 12      0.75 (0.08)

Shared environment (C)

C variance at age 7                           0.31 (0.03)
C variance at age 12                          0.19 (0.03)
C covariance between age 7 and 12             0.12 (0.03)
C correlation between age 7 and 12            0.48 (0.11)

Non-shared environment (E)

E variance at age 7                           0.33 (0.01)
E variance at age 12                          0.32 (0.01)
E covariance between age 7 and 12             0.03 (0.01)
E correlation between age 7 and 12            0.09 (0.03)

^a OpenMx twin model-fitting incorporates full-information maximum likelihood that uses the full sample of 6702 pairs of twins with data at either 7 or 12.
However, the variance estimates at each age are based on twin pairs with data at that age (5320 at 7, 4061 at 12), and the covariance estimates are based on
twin pairs with data at both ages (2269).



(0.17 s.e.) at age 7 to 0.45 (0.14 s.e.) at age 12, although the large 
standard errors indicate that the increase did not reach statistical 
significance. Heritability increased significantly in the twin model- 
fitting analyses, from 0.36 (0.03) at age 7 to 0.49 (0.03) at age 12. 
Thus, GCTA estimates account for 74% of the twin-study 
heritability estimate of g at age 7 and 94% at age 12. 
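
The percentages in the preceding sentence can be reproduced as simple ratios. In this sketch (the function name is illustrative) the quoted 74% and 94% follow from the twin heritabilities of 0.35 and 0.48 given in the abstract; the rounded twin values of 0.36 and 0.49 give slightly lower figures.

```python
def percent_of_twin_estimate(gcta_h2, twin_h2):
    """GCTA heritability expressed as a percentage of the twin heritability."""
    return 100 * gcta_h2 / twin_h2


# Twin estimates as quoted in the abstract (0.35 and 0.48).
print(round(percent_of_twin_estimate(0.26, 0.35)))  # 74
print(round(percent_of_twin_estimate(0.45, 0.48)))  # 94

# Twin estimates as rounded in Table 2 (0.36 and 0.49).
print(round(percent_of_twin_estimate(0.26, 0.36)))  # 72
print(round(percent_of_twin_estimate(0.45, 0.49)))  # 92
```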

Why genetic stability but increasing heritability? 
In summary, GCTA confirms the twin-study hypotheses of strong 
genetic stability and increasing heritability. In other words, the 
same genes are largely (about 75%) responsible for genetic 
influence on g at age 7 and age 12, yet the effect of these genes 
(heritability) increases substantially from age 7 to 12. How is this 
possible? We hypothesize that the same genes affect g from age
to age but heritability increases as children select their own
environments that are correlated with their g-related genetic
propensities, a process called genotype-environment
correlation. This hypothesis makes three predictions. The first
prediction is that g-related experiences will themselves show
genetic influence, for which there is considerable evidence from
twin studies. Second, the links between these experiences and
g are expected to be mediated genetically, evidence for which is
beginning to emerge from twin studies. The third prediction is
that genetic links between g and experience should strengthen
during development, but this has not yet been investigated. These
genetic links are expected especially for experiences in which
children are able to select or modify their environments in line
with their genetic propensities, in contrast to environments
that are passively imposed on children. Supportive evidence to
date for this genotype-environment hypothesis relies on twin
data, but GCTA can also be used to address these issues with DNA
alone.

Genetic architecture 

Our GCTA results clarify the genetic architecture of g in ways that
are relevant to solving the 'missing heritability' puzzle that has
emerged from the limited success of genome-wide association
studies to identify the genes responsible for heritability. Two of
the major hypotheses to account for missing heritability are
epistatic (nonadditive) genetic effects and rare variants, because
genome-wide association research is limited to detecting additive
genetic effects and genetic effects that can be tagged by the
common SNPs used to date on commercially available DNA
arrays. Because GCTA is also limited in these same two ways,
finding significant GCTA estimates of genetic influence provides
strong evidence that current genome-wide association research
strategies can detect the majority of the missing heritability if
samples are sufficiently large to provide power to detect
associations of small effect size. As noted above, our GCTA
estimates of genetic influence account for 74-94% of our twin-study
heritability estimates, which implies that most of the
missing heritability can be found with additive effects of common
SNPs. The heritability that remains missing might be due to
epistatic effects and rare variants.

In our longitudinal genetic analyses from age 7 to 12, the GCTA
estimate of genetic covariance is also somewhat lower than the
twin-study estimate. As shown in Tables 1 and 2, the genetic covariance
for g between ages 7 and 12 (that is, the genetic contribution to
the phenotypic covariance) is 20% lower for GCTA than for twins
(0.25 for GCTA and 0.31 for twins). However, the GCTA
genetic correlation of 0.73 is highly similar to the twin-study
genetic correlation of 0.75. The likely reason is that GCTA genetic
variance and covariance estimates are attenuated by imperfect
linkage disequilibrium between causal variants and genotyped
SNPs, but the GCTA estimate of the genetic correlation is
unbiased, because the genetic correlation is derived from the
ratio of the genetic covariance to the square root of the product
of the genetic variances. Because GCTA genetic variance and
covariance estimates are biased to the same extent by imperfect
linkage disequilibrium, the biases cancel in the calculation of the
genetic correlation.
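
The cancellation argument can be written out explicitly. In the sketch below, λ is a hypothetical attenuation factor (0 < λ ≤ 1), assumed to apply equally to the genetic variances at both ages and to their covariance; V_g and Cov_g denote the unattenuated quantities.

```latex
\hat{r}_g
  = \frac{\lambda\,\mathrm{Cov}_g(7,12)}
         {\sqrt{\lambda\,V_g(7)\cdot\lambda\,V_g(12)}}
  = \frac{\mathrm{Cov}_g(7,12)}
         {\sqrt{V_g(7)\,V_g(12)}}
  = r_g
```

With the Table 1 estimates, 0.25 / sqrt(0.26 × 0.45) ≈ 0.73, which is the reported GCTA genetic correlation. If attenuation differed between the two ages or between variance and covariance, the cancellation would be only approximate.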

Implications for brain structure and function 
To the extent that g indexes general brain function, the present 
results suggest hypotheses for the etiology of individual 
differences in brain development. The same genes can be 
expected to be responsible for individual differences throughout 
brain development despite the major mean changes that occur 
during development. The hypothesis of increasing heritability for
individual differences in brain development points to
genotype-environment correlation as the process by which genotypes
become phenotypes. Importantly, the correspondence between
GCTA and twin results indicates that special samples such as twins
are no longer needed to test such genetic hypotheses in
neurodevelopment; GCTA makes it possible to test them in any
large sample of unrelated individuals.